Abstract

Coordinated behavior of mobile robots is an important emerging
application area. Different coordinated behaviors can be achieved by
assigning sets of control tasks, or strategies, to robots in a team.
These control tasks must be scheduled either locally on the robot or
distributed across the team. An application may have many control
strategies to dynamically choose from, although some may not be
feasible, given limited resource and time availability. Thus, dynamic
feasibility checking becomes important as the coordination between
robots and the tasks that need to be performed evolves with time. This
paper presents an online algorithm for finding a feasible strategy given
a functionally equivalent set of strategies for achieving an
application's goals.

We present two heuristics for feasibility checking. Both consider
communication cost and utilization bound to make allocation (of tasks to
execution sites) and scheduling decisions. Extensive experimental
results show the effectiveness of the approaches, especially in
resource-tight environments. We also demonstrate the application of our
approach to real-world scenarios involving teams of robots and show how
feasibility analysis also allows the prediction of the scalability of
the solution to large robot teams.

Keywords:  Distributed  real-time  systems,  allocation, 
schedulability,  precedence  constraint 

1  Introduction 

A promising application for a team of mobile robots is to collaborate
with each other to accomplish a common goal, for example, searching a
burning building for trapped people. Human operators may direct the
search by teleoperation, but wireless communications in these situations
can be unreliable. When a search robot ventures outside a reliable
communication range, a second robot can autonomously create a network to
preserve quality of service between the operator and the search robot.
The lead robot can thus penetrate further into the rubble at the expense
of communication latencies and distributed control overhead. One
instantiation of such a strategy constructs a serial kinematic chain of
mobile robots where each of them actively preserves Line-Of-Sight (LOS)
[27] and intra-network bandwidth. In the simplest case, pairwise
coordinated controllers were developed for a team of two robots. One
controller, denoted pull, allows a leader robot to search an area while
"pulling" a following robot behind it. The other controller, push,
allows a follower to specify the search area of the leader, in effect,
"pushing" the leader along. The application constructs a strategy by
assigning push or pull controllers to the entire team. The task models
for these two strategies, namely, the push and pull controllers
themselves, are shown in Figure 1.
In the two task graphs, sensor and motor tasks $IR_i$, $POS_i$, and
$M_i$ are preassigned to specific execution sites, while three control
tasks $H_1$, $H_2$, and $L_2$ may reside on either team member, if
necessary, to optimize processor utilization or communication costs. The
$H_1$ and $H_2$ tasks are used by the robots to determine current search
and LOS areas, respectively, while $L_2$ is used for coordinating the
desired movements of the robots so that LOS is maintained and the search
can make progress. The functionality of the team is not affected by
changing the allocations of the control tasks. The differences between
push and pull can be seen at the task graph level. For example, with
push, the communication $H_1 \to L_2$ specifies to the leader which
areas it may search, while with pull, $L_2 \to M_1$ tells the follower
where it may move.

A discussion of how applications generate possible strategies is beyond
the scope of this paper. Usually, applications determine the required
type of coordinated behavior for a team of n robots, and generate a set
of functionally equivalent strategies. Since each strategy is
constructed from periodic real-time tasks, a strategy that is valid at
the application level may not always be feasible at the system level.
How to find feasible (schedulable) strategies from a set of functionally
equivalent strategies given by the application is one goal of this work.

Figure 1. Tasks in a Leader/Follower team

The team size is not fixed, so, as robots enter or leave the team, the
application must recompute the set of strategies that can be used. As
the team size changes, the application may determine that the goal of
the behavior must change as well, which, in turn, also changes the set
of correct strategies that can be used. Figure 2 shows a sequence from a
simulation with five robots using push and pull controllers, where
robot 0 is the leader searching for the goal, which is the square in the
lower left of the map. Each time a robot joins the team and the set of
possible strategies changes, the application must run the on-line
scheduling algorithm to determine which strategies are feasible;
otherwise, a system failure may occur. The control tasks that make up a
strategy may be distributed among sites in a team, such as $H_1$, $H_2$
and $L_2$ in the push and pull models. A goal of our work is to
determine the assignment of tasks to sites in order to optimize the
coordinated behavior while minimizing the communication overhead and
workload at each site, thereby improving overall schedulability.

As shown in the task graphs of push and pull, communication between
tasks is needed to achieve coordination. In this paper, we assume that
each robot is equipped with wireless broadcast communication. Because a
shared multiple-access medium is used in the system, contention for the
communication medium can occur at run time. We avoid this contention by
scheduling the communication as well.

Assigning tasks with precedence relationships in a distributed
environment is in general an NP-hard problem [17], and even some of the
simplest scheduling problems are NP-hard in the strong sense [7].
Systematically derived heuristic allocation and scheduling algorithms
are, therefore, proposed and evaluated in this paper.

The contributions of this paper are as follows. We develop an online
algorithm for finding a feasible control strategy given a functionally
equivalent set of strategies for achieving an application's goals.
Specifically, we propose two simple but efficient heuristics for
allocating control tasks to distributed processing entities, which aim
to improve the overall schedulability by minimizing communication costs
and utilization of processors. We have performed extensive evaluations
of the algorithms and also exercised them using a case study of a
real-world example from mobile robotics to achieve a simple but
efficient allocation and communication scheme for a team of robots.


The rest of the paper is structured as follows. In Section 2, the system
model and goal are described. The details of the allocation and
scheduling algorithms are provided in Section 3. Results of evaluations
from simulation are provided in Section 4. Section 5 analyzes a
real-world robotic application. Section 6 discusses related work.
Section 7 concludes the paper by summarizing the important
characteristics of the algorithm and discusses future work.

2  System  Model  and  Our  Goal 

2.1  System  and  Task  Model 

A coordinated team consists of a set of sites (robots), each site having
an identical processor. In this paper, we use site and processor
interchangeably. Robots in a team share a communication medium, which
allows broadcast communication between robots. To prevent contention,
communication is also prescheduled.

A  strategy,  which  is  specified  at  the  application  level,  is 
denoted  at  the  system  level  by  an  acyclic  Task  Graph  (TG). 
To  achieve  a  common  goal  for  the  team,  a  set  of  functionally 
equivalent  strategies  may  also  be  supplied  by  applications. 

In a TG, nodes represent tasks ($T_i$), and directed edges between tasks
represent precedence (e.g., producer/consumer) relationships. The amount
of communication is denoted as a communication cost attached to the
edges. All tasks in our model are periodic. Each task is characterized
by a period $P_i$, Worst Case Execution Time (WCET) $C_i$, and relative
deadline $D_i$; here, $D_i = P_i$. Periods can be different for
different tasks. But if the producer and the consumer run with arbitrary
periods, task executions may get out of phase, which results in large
latencies in communication [21]. Harmonicity constraints can simplify
the reading/writing logic and reduce those latencies [20]. Harmonic
periods may also increase the feasible processor utilization bound [25].
To this end, we assume the period of the consumer is a multiple of that
of the related producer.
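The harmonicity assumption and the LCM of task periods are easy to check mechanically. The following is a minimal Python sketch, not part of the original system: the function names are illustrative, the periods are the ones listed in Table 2, and the three edges into T4 are taken from the Figure 3 discussion.

```python
from functools import reduce
from math import gcd


def non_harmonic_edges(periods, edges):
    """Return the producer/consumer edges that violate the assumption that
    the consumer period is an integer multiple of the producer period."""
    return [(p, c) for p, c in edges if periods[c] % periods[p] != 0]


def hyperperiod(periods):
    """LCM of all task periods: the window over which instances are scheduled."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods.values())


# Periods from Table 2; edges T1 -> T4, T2 -> T4, T3 -> T4.
periods = {"T1": 10, "T2": 20, "T3": 2, "T4": 20, "T5": 40}
edges = [("T1", "T4"), ("T2", "T4"), ("T3", "T4")]

print(non_harmonic_edges(periods, edges))  # empty list: every edge is harmonic
print(hyperperiod(periods))                # 40
```

A non-empty result from `non_harmonic_edges` would indicate a producer/consumer pair that falls outside the task model assumed in this paper.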

2.2  Our  Goal 

Given  a  set  of  sites  and  a  set  of  functionally  equivalent 
strategies,  our  goal  is  to  find  a  feasible  strategy.  A  strategy 
is  feasible  if  and  only  if: 

• within the LCM (Least Common Multiple) of task periods in that
strategy, each instance of a task is scheduled at its scheduled start
time and the completion of this instance will not be later than its
relative deadline;

•  all  constraints,  such  as  precedence,  are  satisfied. 

Based on the nature of the application, some tasks, e.g., sensor and
motor systems, are required to run on designated sites, e.g., a specific
robot platform. Other tasks, however, can be assigned to any site in a
team. To find a feasible strategy, the system needs to:

1. assign unallocated tasks to appropriate sites;

2. determine a schedule for all task instances.

Figure 2. A sequence of active robots in a robot team

3  Allocation  and  Scheduling  Algorithms 

We now give the details of the allocation and scheduling algorithms. The
notation used in this paper is explained in Table 1, where $CCR_{i,j}$
is defined as:

$$CCR_{i,j} = \frac{\text{communication-cost}(T_i \to T_j)}{C_i + C_j}$$

Notation        Meaning
$T_i$           Task ID
$C_i$           Worst Case Execution Time (WCET) of task $T_i$
$D_i^n$         Deadline of the $n$th instance of $T_i$
$E_i^n$         Earliest start time of the $n$th instance of $T_i$
$S_i$           Site ID
$u_i$           Utilization of the site $S_i$
$u_i^k$         Utilization of the site $S_i$ that $T_k$ is on
$T_i \to T_j$   Precedence constraint between $T_i$ and $T_j$
$CCR_{i,j}$     Communication Cost Ratio of $T_i \to T_j$

Table 1. Notation used in this paper


3.1  Allocation  Algorithm 

Optimal assignment of real-time tasks to distributed processors is an
intractable problem. Two resource-bounded iterative heuristics are
proposed in this section. Based on utilization of each processor and the
amount of communication between tasks, the heuristics attempt to
minimize the workload of each processor, as well as the total
communication. A dynamic utilization threshold is used at each step in
both heuristics. The function of this threshold is to: 1) balance and
minimize the workload of each processor; 2) avoid violating the
utilization bound, for schedulability purposes.

3.1.1  Greedy  Heuristic 

This heuristic considers the amount of communication and computation
involved for each pair of producer and consumer tasks. A decision is
made as to whether these two tasks should be assigned to the same
processor, thereby eliminating the communication cost. At each step, for
all allocated tasks and related unallocated successors, the algorithm
selects an unallocated task $T_y$ that has the largest $CCR_{x,y}$,
where $T_x$ is located on site $S_k$ and $T_x \to T_y$. The algorithm
then attempts to allocate $T_y$ to $S_k$, based on whether or not the
utilization of $S_k$ becomes larger than the threshold $t$. If the
utilization is not larger than $t$, $T_y$ is assigned to $S_k$ and $t$
need not change; otherwise, the algorithm tries to find a site $S_l$
that currently has the least utilization, and attempts to assign $T_y$
to $S_l$. In this case, $t$ may need to be updated. If a proper
processor can be found, the algorithm continues with the next
unallocated task that has the largest CCR among the remaining
unallocated tasks, using the new threshold. If there is no task left to
be assigned and the workload of every processor is less than 1, the
algorithm is deemed successful; if the algorithm chooses a task to be
allocated, but no site can be found (because any processor's utilization
will be larger than 1 after loading this task), the algorithm fails. The
pseudo-code for the Greedy allocation algorithm is shown in Table 3.

The basic idea of updating the threshold $t$ is to use an increasing
limit on utilization. Initially, $t$ is the maximum value of the
utilizations of all processors to which preallocated tasks have been
assigned. When the algorithm has to find a site $S_l$ with the least
utilization, so that it can assign $T_y$ to $S_l$, several conditions
need to be considered. Suppose $t'$ is the utilization of $S_k$ if $T_y$
is assigned to $S_k$, and $u_l'$ is the utilization of $S_l$ if $T_y$ is
assigned to $S_l$. First, if $t' > 1$, then if $l \neq k$ and
$u_l' \le 1$, the task can be assigned to $S_l$ and the threshold is
updated to $\max(t, u_l')$; otherwise, no processor can be found for
loading $T_y$, since all utilizations would be larger than 1. Second, if
$t' \le 1$, then $T_y$ is assigned to $S_l$, but $t$ is updated to $t'$
since this is the expected lowest value since the last time. The
pseudo-code for the function updating the threshold is shown in Table 4.
The returned value is either the new threshold if it finds a location,
or $-1$ if it does not.
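The Greedy heuristic and its threshold update can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based inputs, and the four-task example are hypothetical, WCETs, periods and communication costs are assumed to be integers, and every unallocated task is assumed to be reachable from an allocated one. Instead of maintaining the set R incrementally, the sketch re-evaluates the candidate edges at every step, and ties are broken by iteration order.

```python
from fractions import Fraction


def threshold_update(t_new, k, task_util, U, t):
    """Assignment and threshold update. Returns (site, threshold) or None.

    t_new is the utilization of site k if the task is placed there;
    U (list of site utilizations) is updated in place."""
    if t_new <= t:                                    # Case 1: fits under threshold
        U[k] = t_new
        return k, t
    l = min(range(len(U)), key=lambda i: U[i])        # least-utilized site
    u_l_new = U[l] + task_util
    if t_new > 1:                                     # Case 2.1: site k is overloaded
        if l != k and u_l_new <= 1:
            U[l] = u_l_new
            return l, max(t, u_l_new)
        return None                                   # no site can load the task
    U[l] = u_l_new                                    # Case 2.2: raise the threshold
    return l, t_new


def greedy_allocate(tasks, comm_cost, preallocated, m):
    """tasks: {name: (C, P)}; comm_cost: {(producer, consumer): cost};
    preallocated: {name: site index}; m: number of sites.
    Returns a complete {name: site index} mapping, or None on failure."""
    site_of = dict(preallocated)
    U = [Fraction(0)] * m
    for name, site in site_of.items():
        U[site] += Fraction(*tasks[name])
    t = max(U)
    if t > 1:
        return None

    def ccr(edge):
        x, y = edge
        return Fraction(comm_cost[edge], tasks[x][0] + tasks[y][0])

    unallocated = set(tasks) - set(site_of)
    while unallocated:
        candidates = [e for e in comm_cost
                      if e[0] in site_of and e[1] in unallocated]
        if not candidates:
            return None          # outside the assumptions of this sketch
        x, y = max(candidates, key=ccr)               # largest CCR first
        k, util = site_of[x], Fraction(*tasks[y])
        result = threshold_update(U[k] + util, k, util, U, t)
        if result is None:
            return None
        site_of[y], t = result
        unallocated.remove(y)
    return site_of


# Hypothetical example: A and B are preallocated, X and Y are free.
tasks = {"A": (5, 10), "B": (4, 10), "X": (1, 10), "Y": (2, 10)}
comm_cost = {("A", "X"): 1, ("B", "X"): 3, ("X", "Y"): 2}
assignment = greedy_allocate(tasks, comm_cost, {"A": 0, "B": 1}, m=2)
print(assignment)
```

In this example X joins its producer B because the resulting utilization stays within the initial threshold of 1/2, whereas Y would push site 1 above the threshold and is therefore placed on the least-utilized site, raising the threshold to 7/10. Exact rational arithmetic (`Fraction`) is used only to keep the threshold comparisons free of floating-point noise.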

3.1.2  Aggressive  Heuristic 

To illustrate the motivation behind this heuristic, let us use the
simple task graph depicted in Figure 3, for which WCETs and periods are
given in Table 2.

Figure 3. A simple task graph example (Site(T1) = S1, Site(T2) = S1,
Site(T3) = S2; C1 = 2, C2 = 2, C3 = 1)

Task             T1    T2    T3    T4    T5
WCET ($C_i$)      2     2     1     2     8
Period ($P_i$)   10    20     2    20    40

Table 2. Parameters for tasks in Figure 3

Consider the communications to task $T_4$. $T_4$ is assigned to site
$S_2$ in the Greedy algorithm because $CCR_{3,4}$ is larger than
$CCR_{1,4}$ and $CCR_{2,4}$. However, we notice the accumulated
communication cost from $S_1$ is larger than that from $S_2$, i.e.,
$(CCR_{1,4} + CCR_{2,4}) > CCR_{3,4}$. So, intuitively, it is better to
assign $T_4$ to $S_1$ instead of $S_2$.

The second heuristic we propose takes into account the total
communication from the same site to see which unallocated task has the
largest accumulated CCR, and then selects this task to be considered
next. Since the utilization bound still needs to be considered, once the
task is selected, the assignment and threshold updates are the same as
in the Greedy heuristic. To this end, line 5 in Table 3 needs to be
changed to: Initialize $R = \{R_y^x \mid T_y \in N\}$,
$R_y^x = \sum_i CCR_{i,y}$, $T_i \in F$, $T_i \to T_y$,
$Site(T_i) = S_x$; and line 12 is changed to:
$R = R \setminus \{R_y^x\} \cup \{R_z^x\}$, $\forall T_z \in N$,
$T_y \to T_z$. Following the Aggressive algorithm, $T_4$ is first
selected and assigned to $S_1$, and then $T_5$ is also assigned to
$S_1$, since $S_1$ currently has the least utilization, at 0.4. This
slightly more complicated algorithm is shown in Section 4 to be more
effective, achieving higher schedulability.
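The difference between the two selection rules can be made concrete with a small Python sketch. It only computes which site each heuristic would try first for a given unallocated task; the subsequent utilization-threshold check is not modeled. The WCETs and preallocations are those of Figure 3 and Table 2, whereas the communication costs (one unit per edge) are hypothetical, since the source does not list them, and the function name is illustrative.

```python
from collections import defaultdict
from fractions import Fraction


def preferred_sites(target, wcet, site_of, comm_cost):
    """Return (greedy_site, aggressive_site) for the unallocated task `target`.

    Greedy follows the single producer with the largest CCR; Aggressive
    follows the site with the largest accumulated CCR over its producers."""
    per_producer = {}
    per_site = defaultdict(Fraction)
    for (x, y), cost in comm_cost.items():
        if y == target and x in site_of:
            ratio = Fraction(cost, wcet[x] + wcet[y])   # CCR_{x,y}
            per_producer[x] = ratio
            per_site[site_of[x]] += ratio
    greedy = site_of[max(per_producer, key=per_producer.get)]
    aggressive = max(per_site, key=per_site.get)
    return greedy, aggressive


wcet = {"T1": 2, "T2": 2, "T3": 1, "T4": 2}
site_of = {"T1": "S1", "T2": "S1", "T3": "S2"}
# Hypothetical communication costs: one unit on every edge into T4.
comm_cost = {("T1", "T4"): 1, ("T2", "T4"): 1, ("T3", "T4"): 1}

print(preferred_sites("T4", wcet, site_of, comm_cost))  # ('S2', 'S1')
```

With these costs, $CCR_{1,4} = CCR_{2,4} = 1/4$ and $CCR_{3,4} = 1/3$, so the single largest ratio points to S2 while the accumulated ratio from S1 (1/2) is larger, which reproduces the situation described for Figure 3.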

3.2  Making  Scheduling  Decisions 

After a successful assignment is found, a schedule is needed for each
instance of tasks. Before we discuss the algorithm, let us first define
some terminology.

• Earliest start time. The earliest start time of an instance of a task
is derived from the precedence constraints. Let $L$ be the LCM of task
periods. If task $T_i$ has no predecessors, the first instance is ready
to execute at time 0, denoted as $E_i^1 = 0$; and for the $n$th instance
of that task, $E_i^n = (n - 1) \times P_i$, where $1 \le n \le N_i$,
$N_i = L / P_i$. If $T_i$ has predecessors, its first instance becomes
enabled only when all its predecessors have completed execution. In
order to achieve this condition, the tasks in the original task graph
are topologically ordered. When a task $T_i$ is processed, the lower
bound of $E_i^1$ is set to $\max(E_i^1, E_k^1 + C_k)$, where

Greedy Allocation Algorithm

Input: a task graph $G = (E, V)$; $P_i$, $C_i$ for each task $T_i$;
communication costs; preallocated tasks with related sites; the number
of sites $m$
Output: an assignment to all unallocated tasks such that utilization of
each processor is less than 1

Variables:
    F: the set of tasks that have been allocated
    N: the set of tasks that have not been allocated
    R: the set of communication cost ratios (a)
    U: array of utilizations (workload)
    t: threshold of utilization
    $CCR_{i,j}$: communication cost ratio,
        $\text{communication-cost}(T_i \to T_j) / (C_i + C_j)$

Algorithm 3.1:
1.  Initialize $U = \{u_i \mid i = 1, 2, ..., m\}$, such that for each
    processor $S_i$: $u_i = \sum C_j / P_j$, $T_j \in F \wedge Site(T_j) = S_i$;
2.  Let $t = \max(u_i)$, $u_i \in U$;
    /* t is the threshold for workload control, initialize the threshold */
3.  If ($t > 1$), do
4.      exit without solution;
5.  Initialize $R = \{CCR_{x,y} \mid T_x \in F, T_y \in N\}$, $T_x \to T_y$;
6.  While (N is not empty) do
7.      Find such task $T_y$ that has the maximum value $CCR_{x,y}$ out of R;
8.      Let $u_k' = u_k + C_y / P_y$;
        /* $Site(T_x) = S_k$, calculate the new utilization, if $T_y$ is
           allocated to site k */
9.      If (($t$ = thresholdUpdate($u_k'$, k, $T_y$)) < 0), do
10.         exit without solution; /* cannot find an appropriate site */
11.     Update sets F, N such that $F = F \cup \{T_y\}$, $N = N \setminus \{T_y\}$;
12.     Update set R, such that $R = R \setminus \{CCR_{x,y}\} \cup \{CCR_{y,z}\}$,
        $\forall T_x \in F$, $T_z \in N$, $(T_x \to T_y) \wedge (T_y \to T_z)$.

(a) We use the same notation R to express different functionalities in
the two heuristic algorithms.

Table 3. Greedy allocation algorithm


Function of Assignment and Threshold Update
float thresholdUpdate(float $t'$, int k, Task $T_y$)
/* t is the threshold; $T_y$ and processor k are selected; $t'$ is the
   new workload if assigning $T_y$ to k */

1.  Case 1: $t' \le t$, do /* t' is not larger than the threshold t */
2.      Assign task $T_y$ to processor k;
3.      Update U with the new utilization $u_k = t'$;
4.  Case 2: $t' > t$, do /* t' is larger than the threshold t */
5.      Find the processor l that has the least utilization
        $u_l = \min(u_i)$, $u_i \in U$, let $u_l' = u_l + C_y / P_y$;
6.      Case 2.1: $t' > 1$, do /* processor k cannot load $T_y$ */
7.          If $(l \neq k) \wedge (u_l' \le 1)$, do
8.              Allocate task $T_y$ to processor l;
9.              Update U with the new $u_l = u_l'$;
10.             $t = \max(t, u_l')$;
11.         Else return -1;
            /* $(l = k) \vee (u_l' > 1)$, cannot find a processor to load $T_y$ */
12.     Case 2.2: $t' \le 1$, do /* $u_l' \le t' \le 1$ */
13.         Allocate task $T_y$ to processor l;
14.         Update U with the new $u_l = u_l'$;
15.         $t = t'$;
16. Return t; /* new threshold */

Table 4. Assignment and threshold updates


$\forall k$, $T_k \in Predecessors(T_i)$. Since we will model
communication as a task if the producer/consumer tasks are on different
sites, and we have harmonicity constraints for all such pairs,
initially, the lower bound of $E_i^n$ is assigned to
$(n - 1) \times P_i + E_i^1$.


• Communication task. If the producer and consumer are allocated on the
same site, the communication cost is avoided; otherwise, communication
needs to be scheduled. We model this as a communication task. For
$T_i \to T_j$, a communication task $T_{comm}$ has the following
features.

1. $P_{comm} = P_j$. This is because each instance of $T_j$ needs to
process data sent from $T_i$ only once during one period of $T_j$;

2. $E_{comm}^n = (n - 1) \times P_{comm} + E_i^1$. This is a lower bound
because the instance should begin execution at least after the
completion of the first instance of $T_i$;

3. $D_{comm}^n = n \times P_{comm} - C_j$. This is an upper bound since
the communication should finish its execution no later than the latest
start time of $T_j$.
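The timing parameters defined above can be derived mechanically. The Python sketch below is illustrative only: the function names are hypothetical, the producer/consumer pair uses the T1 and T4 parameters from Table 2, and it covers just the parts of the definitions stated unambiguously in the text (first-instance start times in topological order, per-instance windows with $D_i = P_i$, and communication deadlines $n \times P_{comm} - C_j$).

```python
from graphlib import TopologicalSorter


def first_start_times(wcet, preds):
    """E_i^1 for every task: 0 without predecessors, otherwise the largest
    E_k^1 + C_k over all predecessors T_k, visiting tasks in topological order."""
    first = {}
    for task in TopologicalSorter(preds).static_order():
        first[task] = max((first[k] + wcet[k] for k in preds.get(task, ())),
                          default=0)
    return first


def instance_windows(first_start, period, hyper):
    """(earliest start, absolute deadline) of each instance within the LCM,
    using E_i^n = (n - 1) * P_i + E_i^1 and relative deadline D_i = P_i."""
    return [((n - 1) * period + first_start, n * period)
            for n in range(1, hyper // period + 1)]


def comm_deadlines(consumer_period, consumer_wcet, hyper):
    """D_comm^n = n * P_comm - C_j, with P_comm = P_j."""
    return [n * consumer_period - consumer_wcet
            for n in range(1, hyper // consumer_period + 1)]


# T1 -> T4 with the WCETs and periods of Table 2; the LCM of 10 and 20 is 20.
wcet = {"T1": 2, "T4": 2}
period = {"T1": 10, "T4": 20}
preds = {"T1": [], "T4": ["T1"]}
hyper = 20

first = first_start_times(wcet, preds)
print(first)                                                # {'T1': 0, 'T4': 2}
print(instance_windows(first["T1"], period["T1"], hyper))   # [(0, 10), (10, 20)]
print(instance_windows(first["T4"], period["T4"], hyper))   # [(2, 20)]
print(comm_deadlines(period["T4"], wcet["T4"], hyper))      # [18]
```

`graphlib.TopologicalSorter` expects a mapping from each node to its predecessors, which matches the `preds` dictionary directly.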

Within the LCM, each instance of tasks is treated as an individual
entity to be scheduled. We take into account the deadline, laxity and
earliest start time, which characterize the most important properties
for real-time tasks and precedence constraints, to actively direct the
search toward a feasible schedule. The candidate heuristic functions
$H$ are: (1) Minimum deadline first: $H(T) = Min\_D$; (2) Minimum
earliest-start-time first: $H(T) = Min\_S$; (3) Minimum laxity first:
$H(T) = Min\_L = \min(D_i - (E_i + C_i))$; (4)
$H(T) = Min\_D + W \times Min\_S$; (5) $H(T) = Min\_D + W \times Min\_L$;
(6) $H(T) = Min\_S + W \times Min\_L$, where $W$ is the weight factor to
adjust the effect of different temporal characteristics of tasks.

The search attempts to determine a feasible schedule for a set of tasks
in the following way. It starts with an empty partial schedule as the
root, and tries to extend the schedule with one more task by moving to
one of the vertices at the next level in the search tree until a
feasible schedule is derived. The heuristic function $H$ is applied to
each of the remaining unscheduled tasks at each level of the tree. The
task with the smallest value is selected to extend the current partial
schedule. Because $Min\_S$ considers the precedence constraints, it
performs better than other simple heuristics in our experiments. The
simulation studies also show that $(Min\_D + W \times Min\_S)$ has
superior performance.
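A heavily simplified version of this heuristic-directed search is sketched below in Python. It is an assumption-laden illustration rather than the search used in the paper: it makes a single pass without backtracking, executes instances non-preemptively, treats the shared communication medium as one more site (named "net" here), and uses $H = D + W \times E$ with $W = 4$. All instance names and parameters in the example are hypothetical.

```python
def heuristic_schedule(instances, weight=4):
    """Build a schedule greedily, always extending it with the ready
    instance that minimizes H = deadline + weight * earliest_start.

    instances: {name: {"site", "est", "deadline", "wcet", "preds"}}.
    Returns {name: start time}, or None if some deadline would be missed."""
    finish, start_of, free_at = {}, {}, {}
    pending = dict(instances)
    while pending:
        # Only instances whose predecessors are already scheduled are eligible.
        ready = [n for n, i in pending.items()
                 if all(p in finish for p in i["preds"])]
        if not ready:
            return None                      # dangling or cyclic precedence
        name = min(ready, key=lambda n: pending[n]["deadline"]
                   + weight * pending[n]["est"])
        info = pending.pop(name)
        start = max([info["est"], free_at.get(info["site"], 0)]
                    + [finish[p] for p in info["preds"]])
        end = start + info["wcet"]
        if end > info["deadline"]:
            return None                      # this extension is infeasible
        start_of[name], finish[name] = start, end
        free_at[info["site"]] = end
    return start_of


# Two instances of a producer on S1, one communication task on the shared
# medium, and one consumer instance on S2 (all values hypothetical).
instances = {
    "T1#1": {"site": "S1", "est": 0, "deadline": 10, "wcet": 2, "preds": []},
    "T1#2": {"site": "S1", "est": 10, "deadline": 20, "wcet": 2, "preds": []},
    "comm#1": {"site": "net", "est": 2, "deadline": 18, "wcet": 1,
               "preds": ["T1#1"]},
    "T4#1": {"site": "S2", "est": 3, "deadline": 20, "wcet": 2,
             "preds": ["comm#1"]},
}
schedule = heuristic_schedule(instances)
print(schedule)
```

A `None` result here only means that this particular greedy ordering missed a deadline; a search that explores alternative vertices of the search tree could still succeed on the same input.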

4  Simulated  Results 

To study the features of the proposed algorithms, we conducted several
experiments to evaluate the allocation heuristics with regard to
schedulability. How to use them for an actual robotics application is
discussed in Section 5. Tasks generated in a directed acyclic graph have
the following characteristics:

• The computation time $C_i$ of each task $T_i$ is uniformly distributed
between $C_{min}$ and $C_{max}$, set to 10 and 60 time units,
respectively. The communication cost lies in the range
$(CR \times C_{min}, CR \times C_{max})$, where $CR$ is the
Communication Ratio used to assign communication costs. Experiments were
conducted with $CR$ values between 0.1 and 0.4.

• To address harmonicity relationships, we set a period range,
$(minP_i^I, maxP_i^I)$, for each input task $T_i$ (task without incoming
edges), and $(1, maxP_j^O)$ for each output task $T_j$ (task without
outgoing edges), where $minP_i^I = Lower \times C_i$ and
$maxP_i^I = Upper \times C_i$, $Lower = 1.1$ and $Upper = 4.0$. To
ensure that the periods of output tasks are no less than those of input
tasks, a parameter, mult_factor, is used to set the upper bound of the
period for output task $T_j$:
$maxP_j^O = mult\_factor \times \max(maxP_i^I)$, where $T_i$ are input
tasks and mult_factor is randomly chosen between 1 and 5. In order to
make periods harmonic, first, we process input tasks and make their
periods harmonic; then we tailor the techniques from [20] to process
output tasks; finally, we use the GCD technique for intermediate tasks
to achieve harmonicity constraints. The idea of computing GCD is to do a
backward period assignment: a task $T_k$ gets period $P_k$ from all its
successors so that $P_k = GCD\{P_i \mid T_i \in succ(T_k)\}$. Because of
precedence constraints, periods of output tasks cannot be considered
separately from those of input tasks, so $P_1^O$, which is the least
period of output tasks, is calculated upon the largest period of input
tasks ($P_m^I$), $P_1^O = \lfloor maxP_1^O / P_m^I \rfloor \times P_m^I$.
Other output tasks' periods are computed upon $P_1^O$ to achieve
harmonicity.

• Parameter out_degree is used to set the precedence relationships in
terms of data processed by multiple producers/consumers. For each task,
except for output tasks, the out_degree is randomly chosen between 1
and 3. The total number of tasks in a task set is
$4 \times tasksetsize\_factor$, where
$3 \le tasksetsize\_factor \le 8$, and all the results shown here are
for task sets with four input and four output tasks, though we have
conducted experiments with different numbers.
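The backward GCD period assignment for intermediate tasks can be sketched as follows. This is a minimal Python illustration, assuming that the periods of input and output tasks have already been fixed by the earlier steps; the function name, the `fixed_periods` parameter and the four-task graph are hypothetical.

```python
from functools import reduce
from graphlib import TopologicalSorter
from math import gcd


def backward_periods(succ, fixed_periods):
    """Assign P_k = GCD{P_i | T_i in succ(T_k)} to every task whose period
    is not already fixed (assumed here: all input and output tasks).

    succ: {task: [successor tasks]}, with an entry for every task."""
    periods = dict(fixed_periods)
    # TopologicalSorter reads the mapping as node -> predecessors, so feeding
    # it successor lists yields an order in which successors come first.
    for task in TopologicalSorter(succ).static_order():
        if task not in periods:
            periods[task] = reduce(gcd, (periods[s] for s in succ[task]))
    return periods


succ = {"in": ["mid"], "mid": ["out1", "out2"], "out1": [], "out2": []}
fixed_periods = {"in": 10, "out1": 40, "out2": 60}

periods = backward_periods(succ, fixed_periods)
print(periods["mid"])  # 20
```

Because the assigned period divides the period of every successor, each consumer period is a multiple of its producer's period along those edges; whether the intermediate period is also a multiple of the input periods depends on how the input and output periods were chosen beforehand.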

All the simulation results shown in this section are obtained from the
average value of 10 simulation runs. For each run, we generate 100 test
sets, each set satisfying $\sum_{i=1}^{n} (C_i / P_i) \le m$, where $n$
is the number of tasks and $m$ is the number of processors. For a given
task set, if this condition does not hold, at least one processor
utilization will be larger than 1. The scheme used here is to remove the
task sets that are definitely infeasible. Obviously, this does not
eliminate all infeasible task sets because the presence of communication
costs is not considered. However, since feasibility determination is
intractable, if one heuristic scheme is able to determine a feasible
schedule while another cannot, we can conclude that the former is
superior. Therefore, the performance of the algorithms and parameter
settings are compared using the Success Ratio (SR):

$$SR = \frac{N_{succ}}{N}$$

$N_{succ}$ is the total number of schedulable task sets found by the
algorithm, and $N$ is the total number of task sets tested. Here $N$ is
100 for each simulation run, and for each result point in the graphs,
$SR = (\sum_{i=1}^{10} SR_i) / 10$, where $SR_i = N_{succ} / 100$.
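The utilization filter and the averaged Success Ratio can be expressed in a few lines of Python. The sketch below is illustrative: the function names and the tiny inputs are hypothetical, and task sets are represented simply as lists of integer (WCET, period) pairs.

```python
from fractions import Fraction


def passes_utilization_filter(task_set, m):
    """Necessary (not sufficient) condition: sum of C_i / P_i is at most m."""
    return sum(Fraction(c, p) for c, p in task_set) <= m


def success_ratio(run_results):
    """Average of the per-run ratios SR_i = N_succ / N.

    run_results: one list of booleans per simulation run, where True means
    the algorithm found a feasible schedule for that task set."""
    per_run = [Fraction(sum(run), len(run)) for run in run_results]
    return sum(per_run) / len(per_run)


task_set = [(10, 20), (30, 40)]                 # total utilization 5/4
print(passes_utilization_filter(task_set, 1))   # False
print(passes_utilization_filter(task_set, 2))   # True
print(success_ratio([[True, False], [True, True]]))  # 3/4
```

In the experiments described here each run would contain 100 outcomes and there would be 10 runs per result point.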

The tests involved a system with 2 to 12 processors connected by a
multiple-access network. Resources other than CPUs and the communication
network were not considered. Although we studied the algorithm under
various parameter settings, due to space limitations, here we only show
some of the salient results and conclusions. Details that are not
reported here can be found in [13].

4.1  Choosing  a  Scheduling  Heuristic 

In order to eliminate bias from the scheduling search heuristics, we
first examine which heuristic function is suitable to evaluate the
allocation algorithms in terms of Success Ratio (SR) of schedulability
for a fixed task set size.

For both Greedy and Aggressive, we found $Min\_S$ is the best simple
heuristic. This is because the earliest start time of each instance of a
task encodes the basic precedence information. For integrated
heuristics, $Min\_D + W \times Min\_S$ has substantially better
performance than other heuristics including $Min\_S$. The reason should
be clear: besides precedence constraints, another important factor,
deadline, is also taken into account.

Since $Min\_D + W \times Min\_S$ is a weighted combination of simple
heuristics, we investigate its sensitivity to changes of weight ($W$)
values. When $W = 0$, the heuristic becomes the simple heuristic
$Min\_D$ and does not perform well. When the weight increases from 0 to
4, or from 0 to 12 when #Proc = 2, we see a significant performance
increase for various #Proc values. The algorithm is robust with respect
to the weight, as performance is affected only slightly when the weight
varies from 4 to 30 (or 12 to 30 if #Proc = 2). So we will choose
$W = 4$ for the following experiments. Hereafter, we denote "The Number
of Processors" as "#Proc", and "The Number of Tasks Within a Set" as
"#Task".

4.2  Performance  of  the  Allocation  Algorithm 

In this section, we evaluate the performance of the allocation
algorithms, Greedy and Aggressive, compared with another method, random
allocation. Figure 4 illustrates the results for task set sizes of 12,
20 and 32, respectively, when CR = 0.1. As shown in the graphs, for each
instance, the performance of Aggressive is better than that of Greedy,
which is in turn better than the random allocation. The gains come from
the elimination of communication while maintaining minimal utilization
for each processor. Since Aggressive takes into account the utilization
bound at each assignment step, and tries to cluster as many tasks as
possible, so as to eliminate the total communication cost, it is not
surprising that it achieves better performance than Greedy, which
considers only the individual communication cost for a given site.

The other observation is that the improvement in performance of Greedy
or Aggressive with CR = 0.4 is larger than the improvement with
CR = 0.1, especially when the task set size is large, say, no less than
20. Table 5 shows the improvement of Greedy over random allocation;
Table 6 shows the improvement of Aggressive over random allocation. Both
are given for #Task = 20 and #Task = 32, respectively. As we can see, in
most cases, the improvements with CR = 0.4 are much larger than those
with CR = 0.1 for either the Greedy or the Aggressive heuristic. The
reason is that when CR = 0.4, the communication costs introduce more
workload into the system, and hence increase the resource contention. So
communication costs dictate the schedulability much more than in the
case when CR = 0.1. In contrast to random assignment, our approaches
exploit this important property to direct the allocation assignment;
hence, they work better in a resource-tight environment.

Finally, we found that as the number of processors
increases, the improvements for both Greedy and Aggressive
become smaller for a given task set. This observation shows
that the tighter the resource constraint, the better our
algorithms perform.

Table 5. Improvement of Greedy over random approach (Percentage)

4.3  Effect  of  Communication 

Due to space limitations, we omit the details of the effect of
communication costs, which may be found in [13]. Our results
show that when the number of processors is very limited,
e.g., 2 or 3, the performance for CR = 0.1 and CR = 0.4 is
almost the same. This is because in such a situation it is
hard to find a feasible schedule in either case. But as the
number of processors increases, the performance for CR = 0.1
is better than that for CR = 0.4. This is because each
communication introduces additional precedence constraints
that affect the earliest start time of consumers. The
communication costs also add to the overall system workload;
therefore, lowering communication leads to improved
schedulability.

Figure 4. Effect of allocation algorithms (CR = 0.1)

Table 6. Improvement of Aggressive over random approach (Percentage)

Table 7. WCETs (ms) of tasks in Figure 1

5  Application  of  Our  Algorithms  to  Mobile 
Robotics 

In this section, we return to the robotic problem
discussed in Section 1, where two strategies, push and pull,
are given for a team of two robots. In Table 7, the WCETs
of tasks are taken from an experimental implementation on
a StrongARM 206 MHz CPU; in Table 9, communication
costs are based on the bytes transmitted using the 802.11b
wireless protocol with an 11 Mbit/s transmission rate. Although
802.11b does not allow for real-time transmission guarantees,
medium contention is avoided by prescheduling
communications. The application assigns periods of 220 ms to
all sensor tasks and motor drivers. Therefore, the periods of
controller tasks are also set to 220 ms by the harmonic
constraint. Though these figures are based on the tasks in
Figure 1, they are compatible with tasks that occur with more
robots; let us consider the scenario in which a third robot
wants to join the team. Since the push and pull controllers
are pairwise, there are four strategies, composed of push and
pull controllers, that the application can use for a team of
three robots: {Pull, Pull}, {Pull, Push}, {Push, Pull}, and
{Push, Push}. The task graphs for these four strategies are
shown in Figure 5.
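
As a rough illustration of how a communication cost can be derived from the bytes transmitted, the sketch below converts a payload size into a transmission time at the nominal 11 Mbit/s rate. It is only an approximation under stated assumptions: protocol overhead (headers, acknowledgements, inter-frame gaps) is ignored, the payload size is a made-up value rather than an entry of Table 9, and the function names are hypothetical.

```python
NOMINAL_RATE_BITS_PER_S = 11_000_000  # nominal 802.11b rate quoted in the text
PERIOD_MS = 220.0                     # task period quoted in the text


def transmission_time_ms(payload_bytes: int,
                         rate_bits_per_s: float = NOMINAL_RATE_BITS_PER_S) -> float:
    """Time to send the payload at the given rate, ignoring protocol overhead."""
    return payload_bytes * 8 / rate_bits_per_s * 1000.0


def communication_ratio(payload_bytes: int, period_ms: float = PERIOD_MS) -> float:
    """Fraction of one period consumed by a single transmission."""
    return transmission_time_ms(payload_bytes) / period_ms


# Hypothetical message of 1375 bytes: 11000 bits at 11 Mbit/s.
example_ms = transmission_time_ms(1375)
example_ratio = communication_ratio(1375)
```

In practice the effective throughput of 802.11b is well below the nominal rate, so measured costs would be larger than this estimate.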

First, let us use the Aggressive heuristic to determine the
tasks’ locations in each strategy. The details of the
allocation steps are omitted here due to space limitations. In
this example, since the accumulated communication cost is
considered, the allocation is the same for all strategies: H1 is
assigned to S1, H2 to S2, H3 to S3, L2 to S2 and L3 to S3.

Next, the algorithm determines which strategies are
schedulable using the heuristic Min_D + W × Min_S. To
simplify the analysis, W is set to 1 here. The completion times
for tasks on each site are shown in Table 8. The algorithm
finds that, except for {Push, Pull}, denoted as {ph, pl},
all strategies are feasible, but with different completion
times (including communication delay). Since multiple
strategies are feasible, the application can use some
criterion to rank them. In this case, if the total laxity
is used as the criterion, then the application will choose the
{Pull, Push} strategy, which has the maximal value.
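
Ranking feasible strategies by total laxity can be sketched as follows. The sketch assumes that laxity is the deadline minus the completion time, that the deadline equals the 220 ms period, and that total laxity is the sum over sites. The completion times are invented for illustration and are not the values of Table 8; the strategy labels X, Y and Z and the function names are hypothetical.

```python
from typing import Dict, Optional

DEADLINE_MS = 220.0  # assumed: deadline equal to the period


def is_feasible(completion_ms: Dict[str, float],
                deadline_ms: float = DEADLINE_MS) -> bool:
    """A strategy is feasible if every site finishes by the deadline."""
    return all(c <= deadline_ms for c in completion_ms.values())


def total_laxity(completion_ms: Dict[str, float],
                 deadline_ms: float = DEADLINE_MS) -> float:
    """Sum over sites of (deadline - completion time)."""
    return sum(deadline_ms - c for c in completion_ms.values())


def best_strategy(candidates: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Among feasible strategies, return the one with maximal total laxity."""
    feasible = {name: c for name, c in candidates.items() if is_feasible(c)}
    if not feasible:
        return None
    return max(feasible, key=lambda name: total_laxity(feasible[name]))


# Invented completion times per site (ms), for illustration only.
candidates = {
    "X": {"S1": 150.0, "S2": 180.0, "S3": 200.0},  # total laxity 130
    "Y": {"S1": 120.0, "S2": 160.0, "S3": 210.0},  # total laxity 170
    "Z": {"S1": 100.0, "S2": 230.0, "S3": 90.0},   # misses the deadline on S2
}
chosen = best_strategy(candidates)
```

Other ranking criteria, such as the minimum laxity over all sites, would fit the same structure by swapping the key function.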

The application can then use the feasibility results when
computing new sets of strategies. For example, if at some
time a fourth robot joins the team, the application
immediately knows that any strategy that contains {Push, Pull}
will not be feasible, since that strategy was already
determined to be infeasible. Therefore, the application can use
the feasibility analysis to prune infeasible strategies as the
team’s size scales.
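
The enumeration and pruning of strategies can be sketched as below. The sketch assumes one pairwise controller per adjacent pair of robots, so a team of n robots has 2^(n-1) candidate strategies, and it interprets "contains" as containing the infeasible strategy as a contiguous sub-sequence. Both are assumptions made for illustration, and the function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

CONTROLLERS = ("Pull", "Push")
Strategy = Tuple[str, ...]


def strategies(num_robots: int) -> List[Strategy]:
    """All assignments of one pairwise controller to each adjacent robot pair."""
    return list(product(CONTROLLERS, repeat=num_robots - 1))


def contains(strategy: Strategy, pattern: Strategy) -> bool:
    """True if pattern occurs in strategy as a contiguous sub-sequence."""
    k = len(pattern)
    return any(strategy[i:i + k] == pattern
               for i in range(len(strategy) - k + 1))


def prune(candidates: Sequence[Strategy],
          infeasible: Sequence[Strategy]) -> List[Strategy]:
    """Drop every candidate that contains a known-infeasible strategy."""
    return [s for s in candidates
            if not any(contains(s, bad) for bad in infeasible)]


three_robot = strategies(3)            # the four strategies listed in the text
known_infeasible = [("Push", "Pull")]  # result of the three-robot analysis
four_robot = prune(strategies(4), known_infeasible)
```

Under these assumptions, four of the eight candidate strategies for a four-robot team are discarded without running the scheduler; the remaining four still need a full feasibility check.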

Table 8. The completion time for all strategies

Figure 5. Four possible strategies for a team of three robots using the push and pull controllers

Table 9. Communication costs of Figure 1


6  Related  Work 

Numerous research results have demonstrated the
complexity of designing real-time systems, especially with
respect to temporal constraints [9, 19, 20, 21, 22]. Schedulability
analysis for distributed real-time systems has also attracted
a lot of attention in recent years [12, 15, 16, 28]. For
tasks with temporal constraints, researchers have focused on
generating task attributes, e.g., period, deadline and phase.
For example, Gerber et al. [9, 22] proposed the period
calibration technique to derive periods and related deadlines
and release times from given end-to-end constraints.
Techniques for deriving system-level constraints from
performance requirements are proposed by Seto et al. [23, 24].
When end-to-end constraints are transformed into
intermediate task constraints, most previous research results
assume that task allocation has been done a priori. However,
schedulability is clearly affected by both the temporal
characteristics and the allocation of tasks.

For a set of independent periodic tasks, Liu and Layland
[14] first developed the feasible workload condition for
schedulability analysis in uniprocessor environments.
Much later, Baruah et al. [5] presented a necessary and
sufficient condition, namely U ≤ n (where n is the number of
processors), based on P-fair scheduling for multiprocessors.
Also, upper bounds on the workload for specific scheduling
policies, e.g., EDF and RMA, have been derived for
homogeneous and heterogeneous multiprocessor environments
[2, 4, 6, 10, 11, 26]. All these techniques are for preemptive
tasks, and task or job migrations are assumed to be permitted
without any penalty. If precedence and communication
constraints exist, these results cannot be directly used.
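
For reference, the classical utilization tests mentioned here can be sketched as follows for independent, preemptive periodic tasks with deadlines equal to periods: the Liu and Layland rate-monotonic bound n(2^(1/n) - 1), the uniprocessor EDF condition U ≤ 1, and the multiprocessor condition U ≤ m. The task parameters are made up, and the function names are hypothetical. As the text notes, none of these tests applies directly once precedence and communication constraints are present.

```python
from typing import Sequence, Tuple

PeriodicTask = Tuple[float, float]  # (worst-case execution time, period)


def utilization(tasks: Sequence[PeriodicTask]) -> float:
    """Total utilization U = sum of C_i / T_i."""
    return sum(c / t for c, t in tasks)


def rm_bound(n: int) -> float:
    """Liu and Layland sufficient bound for rate-monotonic scheduling."""
    return n * (2 ** (1 / n) - 1)


def passes_rm_test(tasks: Sequence[PeriodicTask]) -> bool:
    """Sufficient (not necessary) uniprocessor test for rate-monotonic."""
    return utilization(tasks) <= rm_bound(len(tasks))


def passes_edf_test(tasks: Sequence[PeriodicTask]) -> bool:
    """Exact uniprocessor test for preemptive EDF."""
    return utilization(tasks) <= 1.0


def passes_multiprocessor_test(tasks: Sequence[PeriodicTask], m: int) -> bool:
    """U <= m with no single task exceeding utilization 1 (migration allowed)."""
    return utilization(tasks) <= m and all(c <= t for c, t in tasks)


# Made-up task sets with the 220 ms period used in the robotics example.
light = [(20.0, 220.0), (30.0, 220.0), (50.0, 220.0)]
heavy = [(150.0, 220.0), (150.0, 220.0), (150.0, 220.0)]
```

The `heavy` set illustrates the gap between the uniprocessor and multiprocessor conditions: it fails both uniprocessor tests but satisfies U ≤ m for m = 3.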

Peng et al. [17], Abdelzaher et al. [1] and Ramamritham
[18] studied the task allocation and scheduling problem in a
distributed environment. In their models, subtasks or
modules of a task can have precedence and communication
constraints. From this perspective, their work comes closest to
ours. By using a branch-and-bound search algorithm, Peng et
al. [17] find the optimal solution, in the sense of minimizing
the maximum normalized task response time, to the problem
of allocating communicating periodic tasks to heterogeneous
processing nodes. Though the heuristic guides the algorithm
efficiently toward an optimal solution, the algorithm
cannot simply be applied or extended to our environment.
The major differences are: 1) our applications require that
the decision be made online; 2) we consider non-preemptive
scheduling, which is NP-hard in the strong sense even
without precedence constraints [8], while the algorithm [3]
used in their method finds a preemptive schedule; 3) in their
algorithm the precedence constraints are predetermined
among specific instances of tasks, while in our approach
this is accomplished by scheduling subject to the precedence
constraints. In [1], a period-based method is proposed
for the problem of load partitioning and assignment in large
distributed real-time applications. Scalability is achieved by
a recursive divide-and-conquer technique. Ramamritham [18]
discussed a static algorithm for allocating and scheduling
components of periodic tasks across sites in distributed
systems. How to allocate replicas is a major issue addressed
by the algorithm. Our task allocation and scheduling
algorithm, however, focuses on improving schedulability by:
1) using a dynamically increasing threshold to bound the
utilization at each allocation step; 2) considering the
precedence constraints as early as possible by incorporating
the earliest start time into the heuristic scheduling function.

7  Conclusion  and  Future  Direction 

Allocating and scheduling real-time tasks in a
distributed environment is a difficult problem. The algorithms
discussed in this paper provide a framework for allocating
and scheduling periodic tasks with precedence and
communication constraints in a distributed dynamic environment,
such as a mobile robotic system.

Our algorithm was applied to a real-world example from
mobile robotics to achieve a simple but efficient allocation
and scheduling scheme for a team of robots. We believe
that this approach can enable system developers to design a
predictable distributed embedded system, even if there are a
variety of temporal and resource constraints.

Now we discuss some possible extensions to the
algorithm. First, if the system design does not have
preallocated tasks, the heuristic is still applicable. In this case,
the initial threshold is 0. After selecting the first pair of
communicating tasks and randomly assigning them to a
processor, the algorithm can continue to work on the remaining
tasks as in the original algorithm.

Second, the algorithm can be tailored to heterogeneous
systems. If processors are not identical, the execution time
of a task may differ depending on the site on which it runs.
To apply our approach in such an environment, we first
take the worst-case communication cost ratio, calculated
using the slowest processor for each pair of communicating
tasks, and use these values as estimates to choose the task
to be considered next. Then, when we select the processor,
if the task can be assigned to the processor that the producer
is on, we are done; otherwise, we need to consider the
utilization and the speed of a processor at the same time,
e.g., compare the utilizations starting from the fastest
processors to see which processor will have the least
utilization after loading the task, and choose the one with
the minimum value. After each task is assigned, the threshold
changes in a way similar to the original algorithm.
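
The processor-selection step of this heterogeneous extension might look like the sketch below. It is not part of the algorithm evaluated in this paper. It assumes that execution time scales inversely with a relative processor speed, that a fixed utilization bound of 1.0 stands in for the dynamic threshold, and that ties are broken arbitrarily; the `Processor` record, its fields and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Processor:
    name: str
    speed: float        # relative speed, 1.0 = reference processor (assumed model)
    utilization: float  # utilization already committed on this processor


def utilization_after(proc: Processor, wcet_ref_ms: float, period_ms: float) -> float:
    """Utilization of proc if the task is loaded; WCET is scaled by 1 / speed."""
    return proc.utilization + (wcet_ref_ms / proc.speed) / period_ms


def select_processor(procs: Sequence[Processor], wcet_ref_ms: float,
                     period_ms: float, producer_site: Optional[str] = None,
                     bound: float = 1.0) -> Optional[Processor]:
    """Prefer the producer's processor; otherwise pick the least-loaded result."""
    if producer_site is not None:
        for p in procs:
            if (p.name == producer_site
                    and utilization_after(p, wcet_ref_ms, period_ms) <= bound):
                return p
    fits = [p for p in procs
            if utilization_after(p, wcet_ref_ms, period_ms) <= bound]
    if not fits:
        return None
    return min(fits, key=lambda p: utilization_after(p, wcet_ref_ms, period_ms))


# Made-up team: S2 is twice as fast as the reference processor.
procs = [Processor("S1", 1.0, 0.90), Processor("S2", 2.0, 0.50),
         Processor("S3", 1.0, 0.45)]
# Task with a 44 ms reference WCET and a 220 ms period.
choice_overloaded_producer = select_processor(procs, 44.0, 220.0, producer_site="S1")
choice_free_producer = select_processor(procs, 44.0, 220.0, producer_site="S3")
```

In the first call the producer's site S1 would exceed the bound, so the task goes to S2, which ends up with the lowest utilization; in the second call the producer's site S3 has room, so the task is co-located with its producer and the communication is eliminated.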